Twitter Home Page of WeRateDogs

image.png

MY DATA WRANGLING ACT REPORT

Submitted by Opara-Eze Tochi

Here is one of the beautiful moments on the WeRateDogs Twitter page:

"This is Miya and Kiyo. You never know when someone might need a hug, so Miya tries to be proactive about it. Kiyo stays available for quality control. 13/10 for both" - Image Source

image.png

For the data wrangling project in the Udacity Data Analyst Nanodegree, I had the opportunity to work through the entire data analysis process: gathering the data, cleaning it, analyzing it, and finally visualizing the trends in it. The data came from the Twitter account "WeRateDogs," which gives most dogs a rating of at least 10.

WeRateDogs is a Twitter account that rates people's dogs with a humorous comment about the dog. The account was started in 2015 by college student Matt Nelson, and has received international media attention both for its popularity and for the attention drawn to social media copyright law when it was suspended by Twitter for breaking these aforementioned laws - Wikipedia

The denominator of these ratings is almost always 10. The numerators, however, are frequently more than 10: 11/10, 12/10, 13/10, etc. Why? Because "Brent, they're good dogs." Over 9.2 million people follow WeRateDogs as of the time of preparing this report, and it has been featured in international media.

What can we learn from these ratings, then? The most common question has to be: which dog is the most popular? Can we find a connection between favorites, ratings, and retweets? Which dogs got the worst scores? To answer these questions, I combed through the WeRateDogs Twitter data and analyzed it. With the aid of Python libraries like Pandas, Matplotlib and Seaborn, I was able to create some beautiful visualizations for my study.

Some Insights and Visualizations

1 Is there a correlation between the retweet_count and favourite_count columns?

Yes! As seen in the chart below, the trendline shows a positive correlation between the number of retweets and the number of favourites.

image-2.png
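The strength of this relationship can also be checked numerically. The snippet below is only a minimal sketch: the `tweets` DataFrame and its five rows are made-up stand-ins for the cleaned WeRateDogs table, and only the column names come from the report.

```python
import pandas as pd

# Made-up illustrative rows (an assumption); the real analysis used the
# cleaned WeRateDogs table with these two columns.
tweets = pd.DataFrame({
    "retweet_count": [100, 250, 400, 800, 1500],
    "favourite_count": [350, 900, 1300, 2600, 5200],
})

# Pearson correlation coefficient between the two columns.
# Values close to +1 indicate a strong positive linear relationship.
r = tweets["retweet_count"].corr(tweets["favourite_count"])
print(f"Pearson r = {r:.3f}")
```

A scatter plot with a fitted trendline, like the chart above, could be drawn from the same two columns with Seaborn's `regplot`.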

2 Let us visually find out which is the most favourite dog stage among the audience of '@WeRateDogs'

image.png

Puppo was the favourite dog stage among the WeRateDogs audience.
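One way to produce a ranking like this is to group the tweets by dog stage and compare the average favourites per stage. This is a sketch under assumptions: the column name `dog_stage`, the use of the mean as the measure, and every number in the sample rows are illustrative and do not come from the real data set.

```python
import pandas as pd

# Made-up illustrative rows (an assumption), not the real WeRateDogs figures.
tweets = pd.DataFrame({
    "dog_stage": ["puppo", "doggo", "pupper", "puppo", "pupper", "floofer"],
    "retweet_count": [900, 600, 200, 1100, 300, 500],
    "favourite_count": [3000, 2000, 700, 3400, 900, 1500],
})

# Average retweets and favourites per dog stage, highest favourites first.
stage_means = (
    tweets.groupby("dog_stage")[["retweet_count", "favourite_count"]]
    .mean()
    .sort_values("favourite_count", ascending=False)
)

# The first index label is the stage with the highest average favourites.
top_stage = stage_means.index[0]
print(stage_means)
```

Swapping `.mean()` for `.sum()` would give total retweets and favourites per stage instead of averages.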

Golden_retriever leads, followed by Labrador_retriever in prediction 1

image-3.png

Labrador_retriever leads, followed by Golden_retriever in prediction 2

image-2.png

Labrador_retriever leads, followed by Chihuahua in prediction 3

image.png
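Breed rankings like the three above can be obtained by counting how often each breed appears in each prediction column. The sketch below assumes the predictions are stored in columns named `p1`, `p2` and `p3` (hypothetical names) and uses a handful of made-up rows rather than the real prediction data.

```python
import pandas as pd

# Made-up illustrative rows (an assumption); the column names p1, p2, p3
# are hypothetical labels for predictions 1, 2 and 3.
predictions = pd.DataFrame({
    "p1": ["Golden_retriever"] * 3 + ["Labrador_retriever"] * 2 + ["pug"],
    "p2": ["Labrador_retriever"] * 3 + ["Golden_retriever"] * 2 + ["Chihuahua"],
    "p3": ["Labrador_retriever"] * 3 + ["Chihuahua"] * 2 + ["Golden_retriever"],
})

# For each prediction column, count breed occurrences and keep the top two.
top_two = {
    col: predictions[col].value_counts().head(2).index.tolist()
    for col in ["p1", "p2", "p3"]
}
print(top_two)
```

Passing the result of `value_counts()` to a bar plot would give one chart per prediction, similar to the figures shown here.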

Another interesting insight from the analysis is that dogs in the puppo stage had the highest combined total of retweets and favourites.

Conclusion

It is critical to conduct an initial assessment to identify errors in the data set. This assessment helps in understanding how the data was collected, planning the later steps, and making the data cleaning process more efficient.

For more details, check out my GitHub ✌🏾